On a pitch alteration technique in excited cepstral spectrum for high quality TTS

Authors

  • JongDeuk Kim
  • SeongJoon Baek
  • Myung-Jin Bae
Abstract

In speech synthesis, waveform coding methods maintain the intelligibility and naturalness of synthetic speech. To apply waveform coding or hybrid coding techniques to synthesis by rule, we must be able to alter the pitch of synthetic speech. In this paper, we propose a new pitch alteration method that minimizes spectral distortion by exploiting the behavior of the cepstrum. The method splits the spectrum of the speech signal into an excitation spectrum and a formant spectrum and transforms the excitation spectrum into the cepstrum domain. The pitch of the excitation cepstrum is altered by zero insertion or zero deletion, and the pitch-altered spectrum is then reconstructed in the spectrum domain. In performance tests, the average spectral distortion was below 2.29%, while that of the conventional method was 2.47%.
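The zero insertion and zero deletion step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `alter_pitch_quefrency`, the `start` and `shift` parameters, and the toy cepstrum are all assumptions made for illustration. The sketch assumes the excitation cepstrum is available as a one-sided array and that samples between `start` and the pitch peak are close to zero, so shifting the region above `start` moves the pitch peak to a new quefrency. It moves every sample above `start` by the same amount, so it does not rescale higher-order peaks.

```python
import numpy as np

def alter_pitch_quefrency(exc_cep, start, shift):
    """Shift all cepstral samples at quefrency >= start by `shift` samples.

    shift > 0 inserts zeros at `start` (longer pitch period, lower pitch).
    shift < 0 deletes samples beginning at `start` (shorter period, higher pitch).
    The output keeps the input length; assumes |shift| <= len(exc_cep) - start.
    """
    exc_cep = np.asarray(exc_cep, dtype=float)
    head, tail = exc_cep[:start], exc_cep[start:]
    if shift >= 0:
        # Zero insertion: push the tail outward and trim the overflow.
        moved = np.concatenate([np.zeros(shift), tail])[: len(tail)]
    else:
        # Zero deletion: drop the first |shift| tail samples, pad the end.
        moved = np.concatenate([tail[-shift:], np.zeros(-shift)])
    return np.concatenate([head, moved])

# Toy excitation cepstrum (hypothetical data): one pitch peak at quefrency 80.
exc = np.zeros(257)
exc[80] = 1.0

lower = alter_pitch_quefrency(exc, start=30, shift=10)    # peak moves outward
higher = alter_pitch_quefrency(exc, start=30, shift=-10)  # peak moves inward
print(int(np.argmax(lower)), int(np.argmax(higher)))
```

In a complete system, the altered excitation cepstrum would be transformed back to the spectrum domain and recombined with the formant spectrum, as the abstract describes. That reconstruction is omitted here.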

Similar resources

Low Resource TTS Synthesis Based on Cepstral Filter with Phase Randomized Excitation

In this paper we present the acoustic synthesis of a low resource Text-To-Speech (TTS) system based on a 7th order cepstral filter. The excitation signal is designed in the frequency domain by a two-parameter model. This model is able to generate the excitation signal for both voiced and unvoiced segments. The sets of filter coefficients represent the speech units and are stored in a compressed fo...

Non-filter waveform generation from cepstrum using spectral phase reconstruction

This paper discusses non-filter waveform generation from cepstral features using spectral phase reconstruction as an alternative to the conventional source-filter model in text-to-speech (TTS) systems. As the primary purpose of using filters is considered to be producing a waveform from the desired spectrum shape, one possible alternative to the source-filter framework is to dir...

Sinusoidal model parameterization for HMM-based TTS system

A sinusoidal representation of speech is an alternative to the source-filter model. It is widely used in speech coding and unit-selection TTS, but is less common in statistical TTS frameworks. In this work we utilize Regularized Cepstral Coefficients (RCC) estimated in mel-frequency scale for amplitude spectrum envelope modeling within an HMM-based TTS platform. Improved subjective quality for ...

Assigning suitable phrasal tones and pitch accents by sensing affective information from text to synthesize human-like speech

We have carried out several perceptual and objective experiments that show that the present Text-To-Speech (TTS) systems are weak in the relevance of prosody and segmental spectrum in the characterization and expression of emotions. Since it is known that the emotional state of a speaker usually alters the way s/he speaks, the TTS systems need to be improved to generate human-like pitch accents...

The new version of the ROMVOX text-to-speech synthesis system based on a hybrid time domain-LPC synthesis technique

Over the years we have developed several TTS systems for the Romanian language, each with its own advantages and disadvantages [2]. Taking into account that waveform coding (time domain) methods assure a maximum level of intelligibility and naturalness of the synthesized speech, and that superimposing prosodic effects requires the alteration of pitch (frequency domain), we developed a...

Publication date: 1998